Finding Interesting Articles

Myallo's primary function is to find articles that are interesting to you. The part of the application that does this is called the Agent.

The Agent is a process that runs on your computer and automatically looks for, and evaluates, articles that might be of interest. It uses the Interest Profile to direct its search. It finds candidate articles, evaluates them, predicts your level of interest, chooses the articles it predicts are most interesting, and creates new Interest items in its results that reference them.

An article is anything that can be addressed by a URL that Myallo understands which returns plain text. It may be a text file on your computer, a text file on a local network or the Internet, but usually it is a page on the Internet's World Wide Web.

When you use the "Start Agent" command, the Agent uses the check-marked Interests in your profile to determine the basic things to search for. When you use "Fast Lookup", it uses a single search string that you enter.

Myallo uses a set of search sites to determine where to begin its searches.

A search site is a service that accepts a query in URL format and returns text containing other URLs that point to articles which satisfy the query. Many Internet "search engines" can be used as search sites. Myallo's built-in search sites can search for Web pages on the Internet. If you have Mac OS X 10.4 or above, Myallo can also use the system Spotlight feature to very quickly search the text content of all the local files on your computer.

A search engine is a service that can help find the locations of articles that pertain to a particular category. You pass it a URL that includes a search string, and it returns a document containing the URL locations of articles which pertain to that string. Myallo passes the Interest name to the search site as the search string, and usually the site returns the locations of Web pages that contain the specified string.
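The exchange with a search site can be pictured with a short Python sketch. It is only an illustration, not Myallo's actual code: the template URL, the "{query}" placeholder, and the function names are all assumptions made for this example.

```python
import re
from urllib.parse import quote_plus

# Hypothetical search-site template; "{query}" marks where the search string goes.
SEARCH_TEMPLATE = "https://search.example.com/find?q={query}"

def build_query_url(interest_name, template=SEARCH_TEMPLATE):
    """Place the URL-encoded Interest name into the site's query template."""
    return template.replace("{query}", quote_plus(interest_name))

def extract_urls(response_text):
    """Pull http(s) URLs out of the text a search site returns.

    A deliberately simple pattern: a URL ends at whitespace, a quote,
    or an angle bracket.
    """
    return re.findall(r"https?://[^\s\"'<>]+", response_text)

print(build_query_url("rose gardening"))
```

Here the Interest name "rose gardening" becomes the query string "rose+gardening" at the end of the URL. A real search site may expect a different query format.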

Search sites can be added or changed with the Preferences command in the Myallo menu. The "Adjusting the Search Sites" section explains more about starting points and search engines.

Myallo contains a set of built-in search sites. One search site is enabled by default - it utilizes a general Internet search engine - but you can enable any number of sites via the Preferences command.

Myallo does not necessarily work better with any particular search engine - that all depends on what your interests are, what the engine covers, and what the engine itself might find on a topic. Myallo can use many search engines. Some are of general use, others may specialize in finding particular types of information. Set up Myallo to use whatever engine or engines you like best. Third parties can write searchsite definition files that allow Myallo to use new engines. There is information on how to do so in the appendix.

 

What Makes an Article Interesting

When the Agent starts evaluating an article, it looks at each Interest in your profile to see if the article's text contains the exact name of the Interest (or matches an advanced search string, if one is set for that Interest item).

When an article is found to match one of your Interests, the predicted interest level for that article is adjusted according to how interesting you said the matching Interest is, how often the article matches, where the matches are in the body of the article, and other things. For example, if you have an Interest about "gardening" set to a high level, the more times an article mentions that word, the higher the Agent will predict your interest to be. But if you have an Interest named "vegetable" set to a low level (the slider is left of center), the article's predicted interest level will be lowered instead of raised. Finally, the more Interests an article matches, the more the predicted interest level will be affected.
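The following Python sketch shows one way such a prediction could be computed. It is not Myallo's actual formula, which is not documented here. The level range of -1.0 to 1.0 (with 0.0 as the slider's center), the diminishing weight for repeated mentions, and the bonus for an early match are all assumptions chosen for illustration.

```python
def predict_interest(text, interests):
    """Estimate an interest level for an article.

    interests maps an Interest name to a level between -1.0 and 1.0,
    where 0.0 stands for the center of the slider (an assumed scale).
    """
    lowered = text.lower()
    score = 0.0
    for name, level in interests.items():
        term = name.lower()
        count = lowered.count(term)
        if count == 0:
            continue  # this Interest does not match the article
        # More mentions count for more, with diminishing returns (assumed).
        weight = 1.0 - 0.5 ** count
        # Assumed bonus when the first match falls in the first fifth of the text.
        if lowered.find(term) < len(lowered) * 0.2:
            weight *= 1.25
        # Levels left of center are negative, so they lower the score.
        score += level * weight
    return max(-1.0, min(1.0, score))

liked = predict_interest("Gardening tips for gardening fans",
                         {"gardening": 0.8})
mixed = predict_interest("Gardening tips for gardening fans and vegetable growers",
                         {"gardening": 0.8, "vegetable": -0.6})
print(liked > mixed)
```

In this sketch the second article scores lower than the first because it also matches "vegetable", which is set left of center.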

 

The Myallo Agent

The "Start Agent" and "Fast Lookup..." commands in the Agent menu will begin a search session. The manner of searching can be modified, but generally, the Agent will go online and search for articles on the Internet's World Wide Web.

Accessing the Internet

By default, the agent will connect to the Internet and use a search site to find articles on the Web that contain Interest names from the profile. For example, if you have an interest for "gardening", the agent will ask the search site for all articles that contain the word "gardening."

The search site will probably come up with thousands of possibilities. However, the first articles in the list usually contain multiple mentions of the word, mention it near the beginning of the article, mention all the words in a phrase, or come from popular sites or sites that many Web pages link to. Since these are of the most potential interest, some articles nearer the beginning of the list are selected as candidates for evaluation. Since Myallo only scans plain text and HTML, it will skip results that are pictures, PDF files, media files and the like.

Web pages often contain links to other pages. Since links on an interesting page tend to lead to other interesting pages, Myallo may follow some of those links to find even more candidate articles for evaluation. It may follow links on those linked articles, and so on. The parameters of this part of the search are partially controlled by the "width" and "depth" settings in the preferences.
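A rough Python sketch of link following is shown below. It assumes that "width" limits how many new links are followed from each page and that "depth" limits how many link hops away from a starting article the Agent may go. The manual does not define the settings precisely, so treat both meanings, and the function names, as assumptions. The sketch uses a small in-memory table of links instead of real Web pages.

```python
from collections import deque

def collect_candidates(start_urls, get_links, width, depth):
    """Gather candidate URLs breadth-first from the starting points.

    width: most new links followed from any one page (assumed meaning).
    depth: most link hops away from a starting URL (assumed meaning).
    get_links: function returning the links found on a given URL's page.
    """
    seen = set()
    queue = deque()
    for url in start_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))
    candidates = []
    while queue:
        url, level = queue.popleft()
        candidates.append(url)
        if level >= depth:
            continue  # do not follow links any further from here
        followed = 0
        for link in get_links(url):
            if followed >= width:
                break
            if link in seen:
                continue  # already a candidate
            seen.add(link)
            queue.append((link, level + 1))
            followed += 1
    return candidates

# Hypothetical pages and the links they contain.
links = {"a": ["b", "c", "d"], "b": ["e"], "c": ["a", "f"], "e": ["g"]}
print(collect_candidates(["a"], lambda u: links.get(u, []), width=2, depth=2))
# prints ['a', 'b', 'c', 'e', 'f']
```

Page "d" is skipped because only two links are followed from "a", and "g" is skipped because it is three hops from the start.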

  

Using the Agent

The steps for using the Myallo agent are:

  1. Set up an Interest Profile.

  2. If desired, change the Preferences to adjust how the agent will perform its search.

  3. Select the "Start Agent" or "Fast Lookup..." command in the Agent menu. By default, a status drawer will appear which shows the activities of the Agent as it works.

  4. If you want the Agent to stop early, select the "Stop Agent" command in the Agent menu. Otherwise, the Agent will finish by itself eventually, or when it hits one of the limits set in the Preferences. If you want the agent to stop accessing the Internet or using CPU time for a while, without ending the session, you can pause with the "Pause/Continue" command in the Agent menu.

The results of the session appear in the profile window as the Agent works. You can begin browsing the results while Myallo continues to work.

As the Agent works, result items in the profile may appear and disappear. The Agent only retains the most interesting articles, so as the search progresses, less interesting articles can drop out of the list as articles that are more interesting are found.
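This "keep only the best" behavior can be sketched in Python with a small bounded list. The class name, the capacity, and the interest values are made up for the example, and Myallo's own bookkeeping may differ.

```python
import heapq

class ResultList:
    """Keep only the most interesting results, up to a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # min-heap of (interest, url); least interesting first

    def offer(self, interest, url):
        """Add a result. Return the URL that dropped out, or None."""
        entry = (interest, url)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        # Push the new entry, then remove whichever entry is least interesting.
        # That may be the new entry itself, in which case it never appears.
        dropped = heapq.heappushpop(self._heap, entry)
        return dropped[1]

    def ranked(self):
        """URLs from most to least interesting."""
        return [url for _, url in sorted(self._heap, reverse=True)]

results = ResultList(capacity=2)
results.offer(0.4, "a")
results.offer(0.9, "b")
print(results.offer(0.6, "c"))  # prints a
print(results.ranked())         # prints ['b', 'c']
```

Article "a" appears in the list at first, then drops out when the more interesting article "c" arrives.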

 

The Status Drawer

While the Agent works, you can view the progress in the status drawer. If it isn't already showing, the "Agent Status" command in the Agent menu will make it appear. There is also an icon in the window toolbar that can toggle the drawer, and the preferences can be set to open the drawer when a session starts: 

 

Status: This shows what the agent is generally doing. It may be Starting, Running, Pausing, Stopping, or Stopped.

Time: This shows the time the session has been running and the time limit for the session. As the session progresses, the bar slowly extends towards the limit. The time limit is set in the Preferences. If the Agent reaches the time limit, it will end the session.

Evaluations: This shows how many articles the agent has read and evaluated so far, and the limit. If the agent evaluates its limit, it will end the session. The limit is set in the Preferences.

The above items show how far along the agent has gotten in its session. The remaining items show more details about what the agent is doing.

Candidates: Myallo starts by calling on search sites to get some candidate articles to use as starting points. This item shows the number of candidate URLs it has obtained that are waiting to be read. Not all results from the search engines are selected as candidates. Those that are selected have their URLs placed in a queue - a holding area for the URLs until the articles are read by the agent. The number at the left side of the bar indicates how many candidate URLs are currently waiting to be read. The "High" value at the right of the bar indicates the maximum number of URLs that have been waiting at one time (the high water mark). The bar shows the ratio between the current queue size and the high water mark.

Articles: Myallo takes URLs from the candidate queue, and reads them in. The article text or HTML is saved in an article queue for evaluation. The number at the left side of the bar shows how many articles are currently waiting to be evaluated. Similar to the candidates bar, the "High" number is the high water mark for this queue, and the bar shows the relationship between the two numbers.

Evaluated: Myallo scans the items in the articles queue, evaluating them to predict an interest level. Once they are evaluated, they become possible results. The number at the left of the bar shows how many articles have been evaluated but are still waiting to be processed as possible results. The "High" value is the high water mark for this queue, and the bar shows the relationship between the two. Note the number here shows how many evaluated articles are waiting for processing, while the Evaluations bar above counts the total number of evaluations made.

Results: Myallo checks the evaluated articles and keeps a list of them. It takes the current most interesting results and places them in the interest profile. As new evaluations come in, it keeps the profile updated. The number at the left of the bar shows how many results are currently shown in the profile. The number at the right shows the total number of evaluated articles being considered for presenting as results.

Average Interest: This slider shows the average interest level for all the articles that have been evaluated in this session. Note it is not an average for the results presented in the profile (the "header" interest topping the results in the profile shows that) but the average over everything that has been evaluated (the number at the left of the "Evaluations" bar indicates how many evaluations were made).

Though articles being evaluated sort of "flow" from top to bottom through the queues these bars are displaying, articles may be rejected along the way for being uninteresting, so the numbers may not always match up.
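The numbers shown for each queue can be modeled with a few lines of Python. This is a sketch of the idea only. The class and method names are invented, and the bar is assumed to show the current size as a fraction of the high water mark.

```python
from collections import deque

class TrackedQueue:
    """A first-in, first-out queue that remembers its high water mark."""

    def __init__(self, name):
        self.name = name
        self._items = deque()
        self.high = 0  # the largest size the queue has ever reached

    def put(self, item):
        self._items.append(item)
        self.high = max(self.high, len(self._items))

    def get(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)

    def fill_ratio(self):
        """Current size as a fraction of the high water mark (0.0 if unused)."""
        return len(self._items) / self.high if self.high else 0.0

candidates = TrackedQueue("Candidates")
for url in ["u1", "u2", "u3"]:
    candidates.put(url)
candidates.get()  # the agent reads one candidate
print(len(candidates), candidates.high)  # prints 2 3
```

After one candidate is read, two remain waiting while the "High" value stays at three, so the bar in this model would be two-thirds full.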

 

Pausing the Agent

You can ask the Agent to pause, and make it continue later. The Agent tends to keep your Internet connection and your computer busy. Pausing is a way to make it stop using system resources while you do something else.

To pause, select the "Pause/Continue" command in the Agent menu. Selecting it again will make the Agent continue. There is also a Pause/Continue icon in the window toolbar which will alternately pause and continue the agent.

 

Stopping the Agent

The Agent will finish of its own accord when it has evaluated every article it can find, or (more likely) when it reaches its session time limit or evaluation limit. However, you can cause the agent to finish immediately at any time, by stopping the session.

Select the "Stop Agent" command in the Agent menu to make the Agent stop immediately. There is also a Start/Stop icon in the window toolbar that will stop the session if it is in progress.

The Agent will also stop if you close the profile window or quit the application.

 

Viewing the Results

The agent places new interest items representing the results it produces into a section at the top of your interest profile. All the Interests in the figure below except the three with the check marks were inserted by the Agent during its session:

 

 

When the Agent starts, it places an Interest at the top of the profile that includes the date and time of the session. The results from that session are placed underneath as sub-interests. In this case, five results were produced. Each of the results represents one of the articles the Agent predicted to be most interesting. The Location column shows the URL of the article.

To view one of the articles, select it and press Return. You can also select the "Open Interest" item in the Interest menu, or double click the interest in the "Opened" column. Myallo will cause your Internet browser to display the article. The "Opened" column shows when you last opened an interest in this way. If you used the Spotlight searchsite to find local files on your computer, the file is opened using whatever application is appropriate.

You can also peek at an interest's page or file in the Content drawer. Select the "Interest Content" command in the Interest menu to open the drawer. Myallo will load the page for whatever interest you select in the profile, and show it in the drawer. You can enlarge the drawer by pulling on its right hand edge or making the window taller. This is very handy to peek at the page when you are adjusting interest levels on the results, but the relatively small size of the drawer makes it difficult to view the whole page. To read pages in full, open the interest by pressing Return.

Below, we see a glimpse of the selected interest's page in the contents drawer:

The sliders show Myallo's prediction of your level of interest for each result article. The results are sorted from most to least interesting. After you read the article, make sure to correct the Agent's prediction if necessary, in order to teach the Agent. Drag the slider for that article's Interest slightly to the left or right to make the correction, and the Agent will propagate this feedback throughout the profile to improve its accuracy in future sessions. Don't overdo a correction, especially if you are lowering an interest level. The midpoint of the slider is a "normal" indication, and levels left of center will actually cause Myallo to be repelled by matches, to some degree.

Some of the Interests the Agent added are aliases, which refer to the result Interests. The aliases serve two functions.

The aliases under each of your interests show which result articles matched that interest topic. In the example, the "flower" interest has several aliases under it. This means that the word "flower" appeared in those result articles. No aliases appear under the vegetable interest, and this means that term did not appear in the selected result articles. That isn't really a surprise, since articles with that topic mentioned in them were probably rejected due to the low interest level "vegetable" was given.

The four aliases at the bottom of the profile are connected to the "gardening" interest. This can be seen by their indent level.

The other function of aliases is to tell Myallo which of your interests to adjust when you make a correction to a result article's predicted interest level. For example, if you adjust the slider for the first result, Myallo knows to adjust your "flower" interest, because of the alias under it. Actually, your "gardening" interest would also get adjusted by a lesser amount because it is the "parent" of the flower interest. And in turn, the "vegetable" interest, even though it is set to a negative level, would get adjusted by a very tiny amount, because it is a child of the gardening interest. Your adjustments propagate throughout the profile in this manner as part of the learning process. The aliases, as well as the structure of the interests in the profile and the interest settings, control how Myallo propagates these adjustments, and by what amounts.
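The propagation described above can be sketched in Python as follows. The amounts are assumptions: the manual only says that the parent is adjusted by a lesser amount and the parent's other children by a very tiny amount, so the factors 0.5 and 0.1, the -1.0 to 1.0 level scale, and all names here are illustrative.

```python
class Interest:
    """One item in the profile, with a level from -1.0 to 1.0 (assumed scale)."""

    def __init__(self, name, level):
        self.name = name
        self.level = level
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

def clamp(value):
    return max(-1.0, min(1.0, value))

def propagate(matched, delta, parent_factor=0.5, sibling_factor=0.1):
    """Apply a correction to the matched Interest and spread it outward.

    parent_factor and sibling_factor are assumed values, not Myallo's.
    """
    matched.level = clamp(matched.level + delta)
    parent = matched.parent
    if parent is None:
        return
    parent_delta = delta * parent_factor
    parent.level = clamp(parent.level + parent_delta)
    for sibling in parent.children:
        if sibling is not matched:
            sibling.level = clamp(sibling.level + parent_delta * sibling_factor)

gardening = Interest("gardening", 0.5)
flower = gardening.add_child(Interest("flower", 0.25))
vegetable = gardening.add_child(Interest("vegetable", -0.5))

propagate(flower, 0.25)  # the user raised a result that has an alias under "flower"
print(flower.level, gardening.level)  # prints 0.5 0.625
```

In this sketch "vegetable" also moves, but only from -0.5 to about -0.4875, so it remains left of center.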

Note that you should keep using the same profile again and again, and your adjusting the result interest levels will help Myallo learn. As you run sessions, results will accumulate at the top of the profile. You can leave them there, but it is also OK to delete them from the profile. This won't affect the learning, which is stored in the main interests rather than the results. You can click on a "Results" interest to select it, and delete it (and all the individual results under it, and all the aliases to these results scattered throughout the profile) by pressing the Delete key or choosing Delete from the Edit menu.